A Task Setups
Table 4: Shared hyperparameters for all models, given for each task.

Hyperparameter   Random Walk   Algorithm   Reddit/BASE   Enwik8
Layers           4             4           8             8
Hidden size      256           256         512           512
Head count       4             4           8             8
Dropout rate     0.2           0.2         0.3           0.3
Embed.

We provide the hyperparameter setups shared across our models for each task in Table 4.

Random Walk. We train 4-layer models with a hidden size of 256 and 4 attention heads.

Algorithm. We train the 4-layer model with a hidden size of 256 and 4 attention heads. Staircase model which was run 5 times.
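As an illustration only, the shared settings in Table 4 could be written down as a small configuration object. This is a minimal sketch, not the authors' code: the names `SharedConfig`, `TABLE4`, and `head_dim` are hypothetical, and the per-head width assumes the common convention that the hidden size is split evenly across attention heads, which the text does not state.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SharedConfig:
    """Hypothetical container for the shared hyperparameters of Table 4."""
    layers: int
    hidden_size: int
    heads: int
    dropout: float

    @property
    def head_dim(self) -> int:
        # Assumption: the hidden size is divided evenly across heads.
        if self.hidden_size % self.heads != 0:
            raise ValueError("hidden_size must be divisible by heads")
        return self.hidden_size // self.heads


# Values copied from Table 4; the dictionary keys are illustrative names.
TABLE4 = {
    "random_walk": SharedConfig(layers=4, hidden_size=256, heads=4, dropout=0.2),
    "algorithm": SharedConfig(layers=4, hidden_size=256, heads=4, dropout=0.2),
    "reddit_base": SharedConfig(layers=8, hidden_size=512, heads=8, dropout=0.3),
    "enwik8": SharedConfig(layers=8, hidden_size=512, heads=8, dropout=0.3),
}

if __name__ == "__main__":
    for task, cfg in TABLE4.items():
        print(task, cfg, "head_dim =", cfg.head_dim)
```

Under that even-split assumption, the small and large settings give the same per-head width, since 256 / 4 and 512 / 8 are both 64.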